Performance optimization of multiple memory architectures for DSP
Authors
Abstract
Multiple memory module architectures offer higher performance by providing potentially doubled memory bandwidth. Two key problems in achieving high performance on this kind of architecture are variable partitioning and scheduling, yet little research has been done on either. In this paper, we present a new graph model for the variable partitioning problem, the Variable Independence Graph (VIG), which provides more precise information for variable partitioning than previous graph models. We also present a scheduling algorithm, Rotation Scheduling with Variable Repartition (RSVR), that takes advantage of multiple memory modules. It is a new scheduling technique based on retiming and software pipelining, and it can repartition the variables during the scheduling process if necessary. The experimental results show that the algorithm improves schedule length by 44.8% on average. Another major contribution of this paper is an algorithm for design space exploration on multiple memory architectures. Compared to the approach based on the Interference Graph, our approach produces more feasible solutions for a set of schedule length requirements, and solutions with fewer functional units for the same schedule length requirement.
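To make the variable partitioning problem concrete, here is a minimal Python sketch of assigning variables to two memory banks. It is not the paper's VIG construction or the RSVR algorithm, neither of which the abstract specifies. It assumes a hypothetical weighted graph in which an edge weight counts how often two variables could be fetched in the same cycle, and it uses a generic greedy heuristic that tries to put strongly connected variables in different banks. The names `partition_two_banks` and `cut_weight` are illustrative only.

```python
from collections import defaultdict


def partition_two_banks(edges):
    """Greedily assign each variable to memory bank 0 or 1.

    edges: dict mapping (u, v) -> weight, where the weight is ASSUMED to
    count how often u and v could be accessed in the same cycle.
    This is a generic max-cut style heuristic, not the paper's method.
    """
    adj = defaultdict(dict)
    for (u, v), w in edges.items():
        adj[u][v] = adj[u].get(v, 0) + w
        adj[v][u] = adj[v].get(u, 0) + w

    # Place the most heavily connected variables first (ties by name).
    order = sorted(adj, key=lambda x: (-sum(adj[x].values()), x))

    bank = {}
    for var in order:
        # Total edge weight to already-placed neighbours in each bank.
        w_in = [0, 0]
        for nb, w in adj[var].items():
            if nb in bank:
                w_in[bank[nb]] += w
        # Pick the bank with less same-bank weight; ties go to bank 0.
        bank[var] = 0 if w_in[0] <= w_in[1] else 1
    return bank


def cut_weight(edges, bank):
    """Sum of weights of variable pairs that ended up in different banks."""
    return sum(w for (u, v), w in edges.items() if bank[u] != bank[v])


# Hypothetical example: four variables with made-up co-access counts.
edges = {("a", "b"): 3, ("b", "c"): 2, ("a", "c"): 1, ("c", "d"): 4}
bank = partition_two_banks(edges)
print(bank, cut_weight(edges, bank))
```

In this toy input the heuristic separates the two heaviest pairs, so only the lightest edge (a, c) stays within one bank. A scheduler that can repartition during scheduling, as the abstract describes for RSVR, would revisit such an assignment instead of fixing it up front.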
Similar resources
Ultra-Low-Energy DSP Processor Design for Many-Core Parallel Applications
Background and Objectives: Digital signal processors are widely used in energy constrained applications in which battery lifetime is a critical concern. Accordingly, designing ultra-low-energy processors is a major concern. In this work and in the first step, we propose a sub-threshold DSP processor. Methods: As our baseline architecture, we use a modified version of an existing ultra-low-power...
Compilation techniques for high-performance embedded systems with multiple processors
Despite the progress made in developing more advanced compilers for embedded systems, programming of embedded high-performance computing systems based on Digital Signal Processors (DSPs) is still a highly skilled manual task. This is true for single-processor systems, and even more for embedded systems based on multiple DSPs. Compilers often fail to optimise existing DSP codes written in C due ...
Preferred Strategies for Optimizing Convolution on VLIW DSP Architectures
Convolution is a central algorithm for implementing linear time invariant systems that constitute the heart of most digital signal processing algorithms. Performance on the linear convolution algorithm has been one of the primary benchmarks used to discern the performance of dedicated digital signal processing architectures (DSP). While DSP benchmarks are far more varied and complex...
An Efficient LUT Design on FPGA for Memory-Based Multiplication
An efficient Lookup Table (LUT) design for a memory-based multiplier is proposed. This multiplier is well suited to DSP computation where one of the multiplier inputs, the filter coefficient, is fixed. In this design, all possible products of the input multiplicand with the fixed coefficient are stored directly in memory. In contrast to an earlier proposition Odd Multiple Storage ...
VLSI DSP Architectures and Applications
A data flow and control flow model is presented for use in high level synthesis of efficient time multiplexed architectures targeted towards real-time DSP systems. The model is an extension to the polyhedral models used in array synthesis techniques. The model features a mathematical description of dependencies between individual operations and signal instances of multi-dimensional signals for algori...